208 research outputs found

    A Survey of Learning-based Automated Program Repair

    Automated program repair (APR) aims to fix software bugs automatically and plays a crucial role in software development and maintenance. With the recent advances in deep learning (DL), an increasing number of APR techniques have been proposed to leverage neural networks to learn bug-fixing patterns from massive open-source code repositories. Such learning-based techniques usually treat APR as a neural machine translation (NMT) task, where buggy code snippets (i.e., source language) are translated into fixed code snippets (i.e., target language) automatically. Benefiting from the powerful capability of DL to learn hidden relationships from previous bug-fixing datasets, learning-based APR techniques have achieved remarkable performance. In this paper, we provide a systematic survey to summarize the current state-of-the-art research in the learning-based APR community. We illustrate the general workflow of learning-based APR techniques and detail the crucial components, including fault localization, patch generation, patch ranking, patch validation, and patch correctness phases. We then discuss the widely adopted datasets and evaluation metrics and outline existing empirical studies. We discuss several critical aspects of learning-based APR techniques, such as repair domains, industrial deployment, and the open science issue. We highlight several practical guidelines on applying DL techniques for future APR studies, such as exploring explainable patch generation and utilizing code features. Overall, our paper can help researchers gain a comprehensive understanding of the achievements of the existing learning-based APR techniques and promote the practical application of these techniques. Our artifacts are publicly available at https://github.com/QuanjunZhang/AwesomeLearningAPR.

    GAMMA: Revisiting Template-based Automated Program Repair via Mask Prediction

    Automated program repair (APR) aims to fix software bugs without human intervention, and template-based APR has been widely investigated with promising results. However, it is challenging for template-based APR to select the appropriate donor code, which is an important repair ingredient for generating candidate patches. Inappropriate donor code may cause plausible but incorrect patch generation even with correct fix patterns, limiting the repair performance. In this paper, we revisit template-based APR and propose GAMMA, which directly leverages large pre-trained language models for donor code generation. Our main insight is that instead of retrieving donor code in the local buggy file, we can directly predict the correct code tokens based on the context code snippets and repair patterns by a cloze task. Specifically, (1) GAMMA revises a variety of fix templates from state-of-the-art template-based APR techniques (i.e., TBar) and transforms them into mask patterns. (2) GAMMA adopts a pre-trained language model to predict the correct code for masked code as a fill-in-the-blank task. The experimental results demonstrate that GAMMA correctly repairs 82 bugs on Defects4J-v1.2, which achieves 20.59% (14 bugs) and 26.15% (17 bugs) improvement over the previous state-of-the-art template-based approach TBar and learning-based one Recoder. Furthermore, GAMMA repairs 45 bugs and 22 bugs from the additional Defects4J-v2.0 and QuixBugs, indicating the generalizability of GAMMA in addressing the dataset overfitting issue. We also show that adopting other pre-trained language models can provide substantial advancement, e.g., CodeBERT-based and ChatGPT-based GAMMA are able to fix 80 and 67 bugs on Defects4J-v1.2, indicating the scalability of GAMMA.
Overall, our study highlights the promising future of adopting pre-trained models to generate correct patches on top of fix patterns. Comment: Accepted to 38th IEEE/ACM International Conference on Automated Software Engineering (ASE2023)
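
The two steps named in the abstract (turning a fix template into a mask pattern, then filling the mask) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `<mask>` token, the single "mutate the if-condition" template, and the hand-written candidate list that stands in for a pre-trained model's predictions. It is not GAMMA's actual implementation.

```python
import re
from typing import List, Optional

MASK = "<mask>"  # assumed mask token; real models define their own


def mask_condition(line: str) -> Optional[str]:
    """Hypothetical template: replace the condition of an if-statement with a mask."""
    match = re.match(r"^(\s*if\s*\()(.*)(\)\s*\{?\s*)$", line)
    if match is None:
        return None  # the template does not apply to this line
    return f"{match.group(1)}{MASK}{match.group(3)}"


def fill_mask(masked_line: str, candidates: List[str]) -> List[str]:
    """Build one candidate patch per predicted filling.

    In a real pipeline the candidates would come from a language model;
    here they are supplied by hand.
    """
    return [masked_line.replace(MASK, candidate, 1) for candidate in candidates]


buggy_line = "    if (index < 0) {"
masked = mask_condition(buggy_line)
patches = fill_mask(masked, ["index <= 0", "index < 0 || index >= size"])
```

Each resulting patch would still have to be validated, for example against the project's test suite, before it could be considered plausible.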

    A Critical Review of Large Language Model on Software Engineering: An Example from ChatGPT and Automated Program Repair

    Large Language Models (LLMs) have been gaining increasing attention and have demonstrated promising performance across a variety of Software Engineering (SE) tasks, such as Automated Program Repair (APR), code summarization, and code completion. For example, ChatGPT, the latest black-box LLM, has been investigated by numerous recent research studies and has shown impressive performance in various tasks. However, there exists a potential risk of data leakage since these LLMs are usually closed-source with unknown specific training details, e.g., pre-training datasets. In this paper, we seek to review the bug-fixing capabilities of ChatGPT on a clean APR benchmark with different research objectives. We first introduce a new benchmark of buggy and corresponding fixed programs from competitive programming problems starting from 2023, after the training cutoff point of ChatGPT. The results on this benchmark show that ChatGPT is able to fix 109 out of 151 buggy programs using the basic prompt within 35 independent rounds, outperforming state-of-the-art LLMs CodeT5 and PLBART by 27.5% and 62.4% prediction accuracy. We also investigate the impact of three types of prompts, i.e., problem description, error feedback, and bug localization, leading to 34 additional fixed bugs. Besides, we provide additional discussion of the interactive nature of ChatGPT to illustrate the capacity of a dialog-based repair workflow, with 9 additional fixed bugs. Inspired by the findings, we further pinpoint various challenges and opportunities for advanced SE study equipped with such LLMs (e.g., ChatGPT) in the near future. More importantly, our work calls for more research on the reevaluation of the achievements obtained by existing black-box LLMs across various SE tasks, not limited to ChatGPT on APR.

    Backdooring Neural Code Search

    Reusing off-the-shelf code snippets from online repositories is a common practice, which significantly enhances the productivity of software developers. To find desired code snippets, developers resort to code search engines through natural language queries. Neural code search models are hence behind many such engines. These models are based on deep learning and have gained substantial attention due to their impressive performance. However, the security aspect of these models is rarely studied. In particular, an adversary can inject a backdoor into neural code search models so that they return buggy or even vulnerable code with security/privacy issues. This may impact the downstream software (e.g., stock trading systems and autonomous driving) and cause financial loss and/or life-threatening incidents. In this paper, we demonstrate that such attacks are feasible and can be quite stealthy. By simply modifying one variable/function name, the attacker can make buggy/vulnerable code rank in the top 11%. Our attack BADCODE features a special trigger generation and injection procedure, making the attack more effective and stealthy. The evaluation is conducted on two neural code search models, and the results show our attack outperforms baselines by 60%. Our user study demonstrates that our attack is twice as stealthy as the baseline based on the F1 score.

    Green investing in China's air cargo industry: Opportunities and challenges for sustainable transportation

    © 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license. Air cargo remains vital to economic activity, transporting goods from one place to another. Developed and developing countries alike rely on air transport routes as a safe and fast mode of transport. The Chinese economy attracts global attention through its exports. The country's air cargo system is mainly reliant on gasoline and petroleum-based fuels, which harms the country's green transportation agenda. The aviation sector's heavy fuel combustion calls for greenfield investment that supports green energy as an alternative, sustainable fuel. Further, sustainable aviation insurance and financial coverage are needed to mitigate the negative externalities of air cargo operations. On this basis, the study includes air cargo operations, transportation fuel combustion, private investment in transportation, and insurance coverage in a pollution damage function for the Chinese economy, using data from 1975 to 2020. The research employed a non-linear ARDL bounds testing strategy to break down the sequence of variables into dynamic positive and negative multipliers. Positive shocks in air freight, insurance services, and greenfield investment have been shown to reduce carbon emissions immediately and over the long term. In the short term, carbon damages are exacerbated by the negative shocks resulting from the use of transportation fuel and the availability of insurance. Moreover, both the positive and negative shocks associated with transportation fuel combustion and air freight contribute to a rise in carbon damage. The variance decomposition analysis validated the asymmetric correlations between the aforementioned variables in the intertemporal environment.
Based on the findings, negative shocks from total fuel combustion are expected to impose the greatest carbon damages over the next decade, followed by insurance services and air freight operations. The study concludes that air cargo operations need sustainable routes supported by biofuel energy sources, greenfield investment, and sustainable aviation insurance coverage to achieve the ‘green is clean’ transportation agenda.

    S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning

    The VQA Natural Language Explanation (VQA-NLE) task aims to explain the decision-making process of VQA models in natural language. Unlike traditional attention or gradient analysis, free-text rationales can be easier to understand and can gain users' trust. Existing methods mostly use post-hoc or self-rationalization models to obtain a plausible explanation. However, these frameworks are bottlenecked by the following challenges: 1) the reasoning process cannot be faithfully reflected, and the explanations suffer from logical inconsistency; 2) human-annotated explanations are expensive and time-consuming to collect. In this paper, we propose a new Semi-Supervised VQA-NLE via Self-Critical Learning (S3C), which evaluates the candidate explanations by answering rewards to improve the logical consistency between answers and rationales. With a semi-supervised learning framework, S3C can benefit from a tremendous amount of samples without human-annotated explanations. A large number of automatic measures and human evaluations all show the effectiveness of our method. Meanwhile, the framework achieves new state-of-the-art performance on the two VQA-NLE datasets. Comment: CVPR202

    A Survey of Source Code Search: A 3-Dimensional Perspective

    (Source) code search has attracted wide attention from software engineering researchers because it can improve the productivity and quality of software development. Given a functionality requirement, usually described in a natural language sentence, a code search system can retrieve code snippets that satisfy the requirement from a large-scale code corpus, e.g., GitHub. To realize effective and efficient code search, many techniques have been proposed. These techniques improve code search performance mainly by optimizing three core components: the query understanding component, the code understanding component, and the query-code matching component. In this paper, we provide a 3-dimensional perspective survey for code search. Specifically, we categorize existing code search studies into query-end optimization techniques, code-end optimization techniques, and match-end optimization techniques according to the specific components they optimize. Considering that each end can be optimized independently and contributes to the code search performance, we treat each end as a dimension. Therefore, this survey is 3-dimensional in nature, and it provides a comprehensive summary of each dimension in detail. To understand the research trends of the three dimensions in existing code search studies, we systematically review 68 relevant studies. Unlike existing code search surveys, which focus only on the query end or the code end or introduce various aspects shallowly (including codebase, evaluation metrics, modeling technique, etc.), our survey provides a more nuanced analysis and review of the evolution and development of the underlying techniques used in the three ends. Based on a systematic review and summary of existing work, we outline several open challenges and opportunities at the three ends that remain to be addressed in future work. Comment: submitted to ACM Transactions on Software Engineering and Methodology

    Edge-centric queries stream management based on an ensemble model

    The Internet of Things (IoT) involves numerous devices that can interact with each other or with their environment to collect and process data. The collected data streams are sent to the cloud for further processing and the production of analytics. However, any processing in the cloud, even if it is supported by improved computational resources, suffers from increased latency: the data must travel to the cloud infrastructure, and the resulting analytics must travel back to end users or devices. To minimize the latency, we can perform data processing at the edge of the network, i.e., at the edge nodes. The aim is to deliver analytics and build knowledge close to end users and devices, minimizing the time required to produce responses. Edge nodes are transformed into distributed processing points where analytics queries can be served. In this paper, we deal with the problem of allocating queries, defined for producing knowledge, to a number of edge nodes. The aim is to further reduce the latency by allocating queries to nodes that exhibit low load (both current and estimated), so that they can provide the final response in the minimum time. However, before the allocation, we should determine the computational burden that a query will cause. The allocation is performed with the assistance of an ensemble similarity scheme responsible for delivering the complexity class of each query. The complexity class can then be matched against the current load of every edge node. We discuss our scheme and, through a large set of simulations and the adoption of benchmarking queries, we demonstrate the potential of the proposed model, supported by numerical results.
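
The allocation idea in this abstract (estimate a query's complexity class with an ensemble, then match it against node load) can be sketched as follows. The sketch rests on assumptions that are not in the abstract: the ensemble is reduced to a majority vote over class labels, loads are fractions in [0, 1], and the `CLASS_COST` values and the `EdgeNode` type are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EdgeNode:
    name: str
    current_load: float    # assumed to be a fraction of capacity in [0, 1]
    estimated_load: float  # assumed forecast of near-future load in [0, 1]


# Hypothetical extra load that a query of each complexity class adds to a node.
CLASS_COST = {"low": 0.1, "medium": 0.3, "high": 0.6}


def ensemble_complexity_class(votes: List[str]) -> str:
    """Combine the labels proposed by several similarity measures by majority vote."""
    return Counter(votes).most_common(1)[0][0]


def allocate(votes: List[str], nodes: List[EdgeNode]) -> Tuple[str, str]:
    """Return (node name, complexity class) for the least-loaded node that fits the query."""
    complexity = ensemble_complexity_class(votes)
    cost = CLASS_COST[complexity]

    def load(node: EdgeNode) -> float:
        # Be pessimistic: use the larger of the current and estimated load.
        return max(node.current_load, node.estimated_load)

    feasible = [node for node in nodes if load(node) + cost <= 1.0]
    pool = feasible or nodes  # fall back to all nodes if none has spare capacity
    return min(pool, key=load).name, complexity


nodes = [EdgeNode("A", 0.7, 0.8), EdgeNode("B", 0.2, 0.3), EdgeNode("C", 0.1, 0.9)]
chosen = allocate(["high", "high", "low"], nodes)
```

Taking the maximum of current and estimated load is one simple way to account for both quantities; a weighted combination would be an equally reasonable choice.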